dspawpy.analysis package¶
Submodules¶
dspawpy.analysis.aimdtools module¶
- class dspawpy.analysis.aimdtools.MSD(structures: List[Structure], select: str | List[int] = 'all', msd_type='xyz')¶
Bases:
object
- run()¶
- class dspawpy.analysis.aimdtools.RDF(structures: Structure | List[Structure], rmin: float = 0.0, rmax: float = 10.0, ngrid: float = 101, sigma: float = 0.0)¶
Bases:
object
- get_coordination_number(ref_species, species, is_average=True)¶
returns running coordination number
Parameter¶
- ref_species (list of species or just single specie str):
the reference species. The rdfs are calculated with these species at the center
- species (list of species or just single specie str):
the species that we are interested in. The rdfs are calculated on these species.
is_average (bool): whether to take structural average
- rtype:
numpy array
- get_one_rdf(ref_species_index: str | List[str], species_index: str | List[str], index=0)¶
Get the RDF for one structure, indicated by the index of the structure in all structures
- get_rdf(ref_species: str | List[str], species: str | List[str], is_average=True)¶
Wrapper to get the rdf for a given species pair
Parameter¶
- ref_species (list of species or just single specie str):
The reference species. The rdfs are calculated with these species at the center
- species (list of species or just single specie str):
the species that we are interested in. The rdfs are calculated on these species.
- is_average (bool):
whether to take the average over all structures
- returns:
x is the radial points, and rdf is the rdf value.
- rtype:
(x, rdf)
- class dspawpy.analysis.aimdtools.RMSD(structures: List[Structure])¶
Bases:
object
- run(base_index=0)¶
- dspawpy.analysis.aimdtools.get_lagtime_msd(datafile: str | List[str], select: str | List[int] = 'all', msd_type: str = 'xyz', timestep: float = None)¶
计算不同时间步长下的均方位移
- Parameters:
datafile (str or list of str) --
aimd.h5/aimd.json文件路径或包含这些文件的文件夹路径(优先寻找aimd.h5)
写成列表将依次读取数据并合并到一起
例如['aimd1.h5', 'aimd2.h5', '/data/home/my_aimd_task']
select (str or list of int) -- 原子序号列表,原子序号从0开始编号;默认为'all',计算所有原子 暂不支持计算多个元素的MSD
msd_type (str) -- 计算MSD的类型,可选xyz,xy,xz,yz,x,y,z,默认为'xyz',即计算所有分量
timestep (float) -- 相邻结构的时间间隔,单位为fs,默认None,将从datafile中读取,失败则设为1.0fs; 若不为None,则将使用该值计算时间序列
- Returns:
lagtime (np.ndarray) -- 时间序列
result (np.ndarray) -- 均方位移序列
Examples
>>> from dspawpy.analysis.aimdtools import get_lagtime_msd >>> lagtime, msd = get_lagtime_msd(datafile='/data/home/hzw1002/dspawpy_repo/test/2.18/aimd.h5') Reading /data/home/hzw1002/dspawpy_repo/test/2.18/aimd.h5... Calculating MSD... >>> lagtime array([0.000e+00, 1.000e+00, 2.000e+00, ..., 1.997e+03, 1.998e+03, 1.999e+03]) >>> msd array([0.00000000e+00, 8.80447550e-02, 1.83768134e-01, ..., 9.66185452e+02, 9.66812755e+02, 9.67357702e+02])
- dspawpy.analysis.aimdtools.get_lagtime_rmsd(datafile: str | List[str], timestep: float = None)¶
- Parameters:
datafile (str or list of str) --
aimd.h5/aimd.json文件路径或包含这些文件的文件夹路径(优先寻找aimd.h5)
写成列表将依次读取数据并合并到一起
例如['aimd1.h5', 'aimd2.h5', '/data/home/my_aimd_task']
timestep (float) -- 相邻结构的时间间隔,单位为fs,默认None,将从datafile中读取,失败则设为1.0fs; 若不为None,则将使用该值计算时间序列
- Returns:
lagtime (numpy.ndarray) -- 时间序列
rmsd (numpy.ndarray) -- 均方根偏差序列
Examples
>>> from dspawpy.analysis.aimdtools import get_lagtime_rmsd >>> lagtime, rmsd = get_lagtime_rmsd(datafile='/data/home/hzw1002/dspawpy_repo/test/2.18/aimd.h5', timestep=0.1) Reading /data/home/hzw1002/dspawpy_repo/test/2.18/aimd.h5... Calculating RMSD... >>> lagtime array([0.000e+00, 1.000e-01, 2.000e-01, ..., 1.997e+02, 1.998e+02, 1.999e+02]) >>> rmsd array([ 0. , 0.04849931, 0.09181907, ..., 31.09991413, 31.10077061, 31.10237454])
- dspawpy.analysis.aimdtools.get_rs_rdfs(datafile: str | List[str], ele1: str, ele2: str, rmin: float = 0, rmax: float = 10, ngrid: float = 101, sigma: float = 0)¶
计算rdf分布函数
- Parameters:
datafile (str or list of str) --
aimd.h5/aimd.json文件路径或包含这些文件的文件夹路径(优先寻找aimd.h5)
写成列表将依次读取数据并合并到一起
例如['aimd1.h5', 'aimd2.h5', '/data/home/my_aimd_task']
ele1 (list) -- 中心元素
ele2 (list) -- 相邻元素
rmin (float) -- 径向分布最小值,默认为0
rmax (float) -- 径向分布最大值,默认为10
ngrid (int) -- 径向分布网格数,默认为101
sigma (float) -- 平滑参数
- Returns:
r (numpy.ndarray) -- 径向分布网格点
rdf (numpy.ndarray) -- 径向分布函数
Examples
>>> from dspawpy.analysis.aimdtools import get_rs_rdfs >>> rs, rdfs = get_rs_rdfs(datafile='/data/home/hzw1002/dspawpy_repo/test/2.18/aimd.h5',ele1='H',ele2='O') Reading /data/home/hzw1002/dspawpy_repo/test/2.18/aimd.h5... Calculating RDF... >>> rdfs array([0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 6.90409153e-05, 7.47498606e-04, 2.40555636e-03, 2.57666881e-03, 2.53257125e-03, 2.37964722e-03, 1.80028135e-03, 1.20224160e-03, 7.57927477e-04, 3.41004760e-04, 2.85680407e-04, 1.72795412e-04, 1.49718401e-04, 1.44477999e-04, 2.23702566e-04, 2.11836218e-04, 1.34146448e-04, 1.78839343e-04, 9.16910363e-05, 1.45492991e-05, 3.70494493e-05, 3.72784113e-05, 2.82725837e-05, 3.19750004e-05, 8.46456322e-07, 4.72401470e-06, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00])
- dspawpy.analysis.aimdtools.plot_msd(lagtime: ndarray, result: ndarray, xlim: List[float] = None, ylim: List[float] = None, figname: str = None, show: bool = True, ax=None, **kwargs)¶
AIMD任务完成后,计算均方位移(MSD)
- Parameters:
lagtime (np.ndarray) -- 时间序列
result (np.ndarray) -- 均方位移序列
xlim (list of float) -- x轴的范围,默认为None,自动设置
ylim (list of float) -- y轴的范围,默认为None,自动设置
figname (str) -- 图片名称,默认为None,不保存图片
show (bool) -- 是否显示图片,默认为True
ax (matplotlib axes object) -- 用于将图片绘制到matplotlib的子图上
**kwargs (dict) -- 其他参数,如线条宽度、颜色等,传递给plt.plot函数
- Return type:
MSD分析后的图片
Examples
>>> from dspawpy.analysis.aimdtools import get_lagtime_msd, plot_msd
指定h5文件位置,用 get_lagtime_msd 函数获取数据,select 参数选择第n个原子(不是元素)
>>> lagtime, msd = get_lagtime_msd('/data/home/hzw1002/dspawpy_repo/test/2.18/aimd.h5', select=[0]) Reading /data/home/hzw1002/dspawpy_repo/test/2.18/aimd.h5... Calculating MSD...
用获取的数据画图并保存
>>> plot_msd(lagtime, msd, figname='/data/home/hzw1002/dspawpy_repo/test/out/MSD.png', show=False) ==> /data/home/hzw1002/dspawpy_repo/test/out/MSD.png <Axes: xlabel='Time (fs)', ylabel='MSD ($Å^2$)'>
- dspawpy.analysis.aimdtools.plot_rdf(rs: ndarray, rdfs: ndarray, ele1: str, ele2: str, xlim: list = None, ylim: list = None, figname: str = None, show: bool = True, ax: Axes = None, **kwargs)¶
AIMD计算后分析rdf并画图
- Parameters:
rs (numpy.ndarray) -- 径向分布网格点
rdfs (numpy.ndarray) -- 径向分布函数
ele1 (list) -- 中心元素
ele2 (list) -- 相邻元素
xlim (list) -- x轴范围,默认为None,即自动设置
ylim (list) -- y轴范围,默认为None,即自动设置
figname (str) -- 图片名称,默认为None,即不保存图片
show (bool) -- 是否显示图片,默认为True
ax (matplotlib.axes.Axes) -- 画图的坐标轴,默认为None,即新建坐标轴
**kwargs (dict) -- 其他参数,如线条宽度、颜色等,传递给plt.plot函数
- Return type:
rdf分析后的图片
Examples
>>> from dspawpy.analysis.aimdtools import get_rs_rdfs, plot_rdf
先获取rs和rdfs数据作为xy轴数据
>>> rs, rdfs = get_rs_rdfs('/data/home/hzw1002/dspawpy_repo/test/2.18/aimd.h5', 'H', 'O', rmax=6) Reading /data/home/hzw1002/dspawpy_repo/test/2.18/aimd.h5... Calculating RDF...
将xy轴数据传入plot_rdf函数绘图
>>> plot_rdf(rs, rdfs, 'H','O', figname='/data/home/hzw1002/dspawpy_repo/test/out/RDF.png', show=False) ==> /data/home/hzw1002/dspawpy_repo/test/out/RDF.png
- dspawpy.analysis.aimdtools.plot_rmsd(lagtime: ndarray, result: ndarray, xlim: list = None, ylim: list = None, figname: str = None, show: bool = True, ax=None, **kwargs)¶
AIMD计算后分析rmsd并画图
- Parameters:
lagtime -- 时间序列
result -- 均方根偏差序列
xlim (list) -- x轴范围
ylim (list) -- y轴范围
figname (str) -- 图片保存路径
show (bool) -- 是否显示图片
ax (matplotlib.axes._subplots.AxesSubplot) -- 画子图的话,传入子图对象
**kwargs (dict) -- 传入plt.plot的参数
- Return type:
rmsd分析结构的图片
Examples
>>> from dspawpy.analysis.aimdtools import get_lagtime_rmsd, plot_rmsd
timestep 表示时间步长
>>> lagtime, rmsd = get_lagtime_rmsd(datafile='/data/home/hzw1002/dspawpy_repo/test/2.18/aimd.h5', timestep=0.1) Reading /data/home/hzw1002/dspawpy_repo/test/2.18/aimd.h5... Calculating RMSD...
直接保存为RMSD.png图片
>>> plot_rmsd(lagtime, rmsd, figname='/data/home/hzw1002/dspawpy_repo/test/out/RMSD.png', show=False) ==> /data/home/hzw1002/dspawpy_repo/test/out/RMSD.png <Axes: xlabel='Time (fs)', ylabel='RMSD (Å)'>
dspawpy.analysis.vacf module¶
This module has not been tested yet, use at your own risk!
- dspawpy.analysis.vacf.mirror(arr, axis=0)¶
Mirror array arr at index 0 along axis. The length of the returned array is 2*arr.shape[axis]-1 .
- dspawpy.analysis.vacf.norm_int(y, x, area=1.0, scale=True, func=<function simps>)¶
Normalize integral area of y(x) to area. :param x: :type x: numpy 1d arrays :param y: :type y: numpy 1d arrays :param area: :type area: float :param scale: Scale x and y to the same order of magnitude before integration.
This may be necessary to avoid numerical trouble if x and y have very different scales.
- Parameters:
func (callable) -- Function to do integration (like scipy.integrate.{simps,trapz,...} Called as
func(y,x)
. Default: simps- Return type:
scaled y
Notes
The argument order y,x might be confusing. x,y would be more natural but we stick to the order used in the scipy.integrate routines.
- dspawpy.analysis.vacf.pad_zeros(arr, axis=0, where='end', nadd=None, upto=None, tonext=None, tonext_min=None)¶
Pad an nd-array with zeros. Default is to append an array of zeros of the same shape as arr to arr's end along axis. :param arr: :type arr: nd array :param axis: :type axis: the axis along which to pad :param where: start ("prepend to array") of axis :type where: string {'end', 'start'}, pad at the end ("append to array") or :param nadd: of an 1d array) :type nadd: number of items to padd (i.e. nadd=3 means padd w/ 3 zeros in case :param upto: :type upto: pad until arr.shape[axis] == upto :param tonext: array has a length of power of two) :type tonext: bool, pad up to the next power of two (pad so that the padded :param tonext_min: power of two for which the resulting array length along axis is at
least tonext_min; the default is tonext_min = arr.shape[axis]
- Parameters:
nadd (Use only one of) --
upto --
tonext. --
- Return type:
padded array
Examples
>>> # 1d >>> pad_zeros(a) array([1, 2, 3, 0, 0, 0]) >>> pad_zeros(a, nadd=3) array([1, 2, 3, 0, 0, 0]) >>> pad_zeros(a, upto=6) array([1, 2, 3, 0, 0, 0]) >>> pad_zeros(a, nadd=1) array([1, 2, 3, 0]) >>> pad_zeros(a, nadd=1, where='start') array([0, 1, 2, 3]) >>> # 2d >>> a=arange(9).reshape(3,3) >>> pad_zeros(a, nadd=1, axis=0) array([[0, 1, 2], [3, 4, 5], [6, 7, 8], [0, 0, 0]]) >>> pad_zeros(a, nadd=1, axis=1) array([[0, 1, 2, 0], [3, 4, 5, 0], [6, 7, 8, 0]]) >>> # up to next power of two >>> 2**arange(10) array([ 1, 2, 4, 8, 16, 32, 64, 128, 256, 512]) >>> pydos.pad_zeros(arange(9), tonext=True).shape (16,)
- dspawpy.analysis.vacf.pdos(vel, dt=1.0, m=None, full_out=False, area=1.0, window=True, npad=None, tonext=False, mirr=False, method='direct')¶
Phonon DOS by FFT of the VACF or direct FFT of atomic velocities.
Integral area is normalized to area. It is possible (and recommended) to zero-padd the velocities (see npad).
- Parameters:
vel (3d array (nstep, natoms, 3)) -- atomic velocities
dt (time step) --
m (1d array (natoms,),) -- atomic mass array, if None then mass=1.0 for all atoms is used
full_out (bool) --
area (float) -- normalize area under frequency-PDOS curve to this value
window (bool) -- use Welch windowing on data before FFT (reduces leaking effect, recommended)
npad ({None, int}) -- method='direct' only: Length of zero padding along axis. npad=None = no padding, npad > 0 = pad by a length of
(nstep-1)*npad
. npad > 5 usually results in sufficient interpolation.tonext (bool) -- method='direct' only: Pad vel with zeros along axis up to the next power of two after the array length determined by npad. This gives you speed, but variable (better) frequency resolution.
mirr (bool) -- method='vacf' only: mirror one-sided VACF at t=0 before fft
- Returns:
if full_out = False -- |
(faxis, pdos)
| faxis : 1d array [1/unit(dt)] | pdos : 1d array, the phonon DOS, normalized to areaif full_out = True -- | if method == 'direct': |
(faxis, pdos, (full_faxis, full_pdos, split_idx))
| if method == 'vavcf': |(faxis, pdos, (full_faxis, full_pdos, split_idx, vacf, fft_vacf))
| fft_vacf : 1d complex array, result of fft(vacf) or fft(mirror(vacf)) | vacf : 1d array, the VACF
Notes
padding (only method='direct'): With npad we pad the velocities vel with
npad*(nstep-1)
zeros along axis (the time axis) before FFT b/c the signal is not periodic. For npad=1, this gives us the exact same spectrum and frequency resolution as withpdos(..., method='vacf',mirr=True)
b/c the array to be fft'ed has length2*nstep-1
along the time axis in both cases (remember that the array length = length of the time axis influences the freq. resolution). FFT is only fast for arrays with length = a power of two. Therefore, you may get very different fft speeds depending on whether2*nstep-1
is a power of two or not (in most cases it won't). Try using tonext but remember that you get another (better) frequency resolution.References
[1] Phys Rev B 47(9) 4863, 1993
See also
pwtools.signal.fftsample()
,pwtools.signal.acorr()
,direct_pdos()
,vacf_pdos()
- dspawpy.analysis.vacf.slicetake(a, sl, axis=None, copy=False)¶
The equivalent of numpy.take(a, ..., axis=<axis>), but accepts slice objects instead of an index array. Also by default, it returns a view and no copy. :param a: :type a: numpy ndarray :param sl:
- axis=<int>
one slice object for that axis
- axis=None
sl is a list or tuple of slice objects, one for each axis. It must index the whole array, i.e. len(sl) == len(a.shape).
- Parameters:
axis ({None, int}) --
copy (bool, return a copy instead of a view) --
- Return type:
A view into a or copy of a slice of a.
Examples
>>> from numpy import s_ >>> a = np.random.rand(20,20,20) >>> b1 = a[:,:,10:] >>> # single slice for axis 2 >>> b2 = slicetake(a, s_[10:], axis=2) >>> # tuple of slice objects >>> b3 = slicetake(a, s_[:,:,10:]) >>> (b2 == b1).all() True >>> (b3 == b1).all() True >>> # simple extraction too, sl = integer >>> (a[...,5] == slicetake(a, 5, axis=-1)) True
- dspawpy.analysis.vacf.sum(arr, axis=None, keepdims=False, **kwds)¶
This numpy.sum() with some features implemented which can be found in numpy v1.7 and later: axis can be a tuple to select arbitrary axes to sum over. We also have a keepdims keyword, which however works completely different from numpy. Docstrings shamelessly stolen from numpy and adapted here and there. :param arr: :type arr: nd array :param axis: Axis or axes along which a sum is performed. The default (axis =
None) is to perform a sum over all the dimensions of the input array. axis may be negative, in which case it counts from the last to the first axis. If this is a tuple of ints, a sum is performed on multiple axes, instead of a single axis or all the axes as before.
- Parameters:
keepdims (bool, optional) -- If this is set to True, the axes from axis are left in the result and the reduction (sum) is performed for all remaining axes. Therefore, it reverses the axis to be summed over.
**kwds (passed to np.sum().) --
Examples
>>> a=rand(2,3,4) >>> num.sum(a) 12.073636268676152 >>> a.sum() 12.073636268676152 >>> num.sum(a, axis=1).shape (2, 4) >>> num.sum(a, axis=(1,)).shape (2, 4) >>> # same as axis=1, i.e. it inverts the axis over which we sum >>> num.sum(a, axis=(0,2), keepdims=True).shape (2, 4) >>> # numpy's keepdims has another meaning: it leave the summed axis (0,2) >>> # as dimension of size 1 to allow broadcasting >>> numpy.sum(a, axis=(0,2), keepdims=True).shape (1, 3, 1) >>> num.sum(a, axis=(1,)) - num.sum(a, axis=1) array([[ 0., 0., 0., 0.], [ 0., 0., 0., 0.]]) >>> num.sum(a, axis=(0,2)).shape (3,) >>> num.sum(a, axis=(0,2)) - a.sum(axis=0).sum(axis=1) array([ 0., 0., 0.])
- dspawpy.analysis.vacf.vacf(vel, m=None, method=3)¶
Reference implementation for calculating the VACF of velocities in 3d array vel. This is slow. Use for debugging only. For production, use fvacf().
- Parameters:
vel (3d array, (nstep, natoms, 3)) -- Atomic velocities.
m (1d array (natoms,)) -- Atomic masses.
method (int) --
1 : 3 loops2 : replace 1 inner loop3 : replace 2 inner loops
- Returns:
c -- VACF
- Return type:
1d array (nstep,)
- dspawpy.analysis.vacf.welch(M, sym=1)¶
Welch window. Function skeleton shamelessly stolen from scipy.signal.bartlett() and others.